Abstract

Significant  progress  was  made  in  a  number  of  aspects  of  stochastic  and  discrete  event 
systems.  A  controlled  switching  diffusion  model  was  developed  to  study  systems  with  multiple 
modes  or  failure  modes,  such  as  aircraft  with  multiple  operating  modes,  the  hierarchical  control 
of  flexible  manufacturing  systems,  and  large  scale  interconnected  power  networks.  In  addition, 
a  number  of  other  optimal  stochastic  control  problems  with  long  time  horizons  were  solved. 
An  important  problem  in  the  adaptive  control  of  a  finite  state  Markov  chain  was  solved,  and 
significant  progress  was  made  along  more  general  directions.  New  results  on  the  risk-sensitive 
control  of  hidden  Markov  processes  were  obtained.  Motivated  by  applications  in  hierarchical
intelligent  control,  some  important  problems  in  the  area  of  discrete  event  systems  were  solved. 


1.  SUMMARY  OF  RESEARCH  PROGRESS  AND  RESULTS 


Realistic  models  of  many  engineering  systems,  including  those  in  the  fields  of  aerospace 
navigation  and  vehicular  control,  neural  networks,  the  control  of  robotic  manipulators,
communication  system  design,  and  optimal  control,  involve  unknown  parameters,  nonlinearities,
and  noise  disturbances.  The  design  of  high  performance  control  systems,  in  aerospace  as 
well  as  other  applications,  generally  requires  the  use  of  adaptive  control  techniques  when  the 
parameters  are  unknown  or  may  be  changing.  With  this  motivation,  we  proposed  research 
concerned  with  the  study  of  several  basic  questions  in  the  adaptive  estimation  and  control  of 
stochastic  systems. 

During  the  period  supported  by  this  grant,  we  have  made  significant  progress  both  in  areas 
we  proposed  to  investigate  and  in  related  areas.  In  this  section,  we  summarize  the  progress  in 
those  areas  that  have  resulted  in  publications. 


1.1  Stochastic  Control 

First,  a  controlled  switching  diffusion  model  was  developed  to  study  systems  with  multiple 
modes  or  failure  modes,  such  as  aircraft  with  multiple  operating  modes,  the  hierarchical  control 
of  flexible  manufacturing  systems,  and  large  scale  interconnected  power  networks  [1],  [14],  [24]. 
On-line  implementable  optimal  feedback  policies  were  derived  for  a  discounted  cost  stochastic
optimization  problem  in  this  setting.  Our  treatment  of  the  optimization  problem  is  based 
on  a  convex  analytic  approach  which  is  interesting  in  its  own  right  and  is  more  flexible  and 
powerful  for  certain  other  purposes,  e.g.,  the  pathwise  average  cost  problem  or  problems  with
several  constraints,  to  which  other  approaches  do  not  seem  to  be  amenable.  Using  this  method,
we  prove  in  [1],  [14],  [24]  the  existence  of  a  homogeneous  Markov  nonrandomized  optimal 
control  law.  Using  the  existence  of  such  a  control  law,  the  existence  of  a  unique  solution  in  a 
certain  class  to  the  associated  Hamilton-Jacobi-Bellman  (HJB)  equations  is  established  and  the 
optimal  control  law  is  characterized  as  a  minimizing  selector  of  an  appropriate  Hamiltonian.
This  methodology  is  used  to  solve  a  particular  problem  in  the  hierarchical  control  of  flexible 
manufacturing  systems;  in  this  problem,  the  model  involves  a  hybrid  process  in  continuous 
time  whose  state  is  given  by  a  pair  (X(t), S(t)).  Here,  X(t)  denotes  the  downstream  buffer
stock  of  parts,  which  may  have  a  negative  value  to  indicate  a  backlogged  demand.  The
continuous  component  X(t)  is  governed  by  a  controlled  diffusion  process  with  a  drift  vector
which  depends  on  the  discrete  component  S(t).  Thus,  X(t)  switches  from  one  diffusion  path  to
another  as  the  discrete  component  S(t)  jumps  from  one  state  to  another.  On  the  other  hand,
the  discrete  component  S(t),  denoting  the  number  of  operational  machines,  is  influenced  by
the  inventory  size  and  production  scheduling,  and  can  also  be  controlled  by  various  decisions
such  as  produce,  repair,  replace,  etc.  Hence,  S(t)  evolves  as  a  “controlled  Markov  chain”  with
a  transition  matrix  depending  on  the  continuous  component.  The  corresponding  average  cost 
optimization  problem  is  considered  in  [12]  and  [19].  Under  certain  conditions,  we  establish 
the  existence  of  a  stable  Markov  nonrandomized  policy  which  is  almost  surely  optimal  for  the
pathwise  long-run  average  cost  criterion.  We  characterize  the  optimal  policy  as  a  minimizing
selector  of  the  Hamiltonian  associated  with  the  HJB  equations.  We  apply  these  results  to  the 
failure-prone  manufacturing  system  problem  and  show  that  the  optimal  production  rate  is  of
the  hedging  point  type. 
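
As an illustration only, a hedging point production rule can be sketched as follows. The threshold `z`, the maximum rate `max_rate`, and the demand rate `demand` are hypothetical names introduced here; the cited papers establish that the optimal rate has this threshold form but the sketch does not reproduce their model or compute the optimal threshold.

```python
def hedging_point_rate(x, machine_up, z, max_rate, demand):
    """Illustrative hedging point rule (all parameters are assumptions).

    x          : current buffer stock (negative means backlog)
    machine_up : True if the machine is operational
    z          : hedging point, i.e. the target safety stock
    max_rate   : maximum production rate when the machine is up
    demand     : demand rate
    """
    if not machine_up:
        return 0.0                     # nothing can be produced while down
    if x < z:
        return max_rate                # below the target: build up stock
    if x > z:
        return 0.0                     # above the target: let demand drain it
    return min(demand, max_rate)       # at the target: just track demand
```

The rule produces at full rate while the stock is below the hedging point, which builds a safety stock against future machine failures.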

The  optimal  control  of  diffusions  is  considered  in  [2].  Under  a  penalizing  condition  on  the 
cost  for  unstable  behavior  or  under  a  Liapunov  stability  condition,  we  establish  the  existence
of  a  stable  Markov  control  law  which  is  strong  average  optimal.  We  study  in  [3]  average 
cost  Markov  decision  processes  on  a  countable  state  space,  compact  action  space  and  with 
unbounded  costs;  we  prove  the  existence  of  a  stable  stationary  strategy  which  is  optimal.  In 
[11]  and  [20]  we  study  Markov  decision  processes,  with  an  infinite  planning  horizon,  under  a 
cost  criterion  obtained  as  a  weighted  combination  of  the  discounted  and  long-run  average  cost 
criteria.  In  addition,  a  functional  characterization  is  given  for  overtaking  optimal  policies,  for 
problems  with  countable  state  spaces  and  compact  control  spaces.

We  consider  in  [18],  [23]  Markov  decision  processes  with  an  average  cost  criterion,  general 
control  and  state  spaces,  and  unbounded  one-stage  cost  functions.  We  show  how  structural 
properties  in  the  model  can  be  used  to  obtain  a  functional  characterization  of  optimal  values 
and  policies,  in  the  form  of  an  average  cost  optimality  equation  (ACOE).  In  particular,  if  the 
(discounted)  value  functions  are  convex,  the  ACOE  is  obtained  as  a  limit  of  the  corresponding 
discounted  optimality  equations.  We  further  comment  on  the  potential  algorithmic  impact 
of  this  and  other  structured  solutions,  and  the  application  of  the  results  to,  e.g.,  inventory
control  problems. 
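
For orientation, the ACOE and the limiting procedure mentioned above are commonly written as follows. The notation is assumed here and is not taken from [18], [23]: c is the one-stage cost, Q the transition kernel, A(x) the admissible controls, V_\beta the \beta-discounted value function, and x_0 an arbitrary reference state. With unbounded costs the limits may hold only along a subsequence and under additional conditions.

```latex
% Average cost optimality equation (ACOE): optimal average cost \rho^*
% and relative value function h
\rho^* + h(x) = \min_{a \in A(x)}
    \Big\{ c(x,a) + \int h(y)\, Q(dy \mid x, a) \Big\}

% Vanishing discount construction of (\rho^*, h) from the discounted problems
\rho^* = \lim_{\beta \uparrow 1} (1-\beta)\, V_\beta(x_0),
\qquad
h(x) = \lim_{\beta \uparrow 1} \big[ V_\beta(x) - V_\beta(x_0) \big]
```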

In  [21]  we  consider  a  risk-sensitive  optimal  control  problem  for  hidden  Markov  models 
(HMM).  Building  upon  recent  results  by  Baras,  James  and  Elliott,  we  investigate  the  structure
of  risk-sensitive  controllers  for  HMM,  via  an  examination  of  a  popular  benchmark  problem. 
We  obtain  new  results  on  the  structure  of  the  risk-sensitive  controller  by  first  proving  concavity 
and  piecewise  linearity  of  the  value  function.  Furthermore,  we  compare  the  structure  of
risk-sensitive  and  risk-neutral  controllers.
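
To make the comparison concrete, a standard form of the risk-sensitive criterion is shown below. The symbols are assumptions introduced for illustration (C is the accumulated cost over the horizon, \gamma > 0 the risk-sensitivity parameter); the exact criterion and scaling used in [21] may differ.

```latex
% Exponential-of-sum (risk-sensitive) criterion for a policy \pi
J_\gamma(\pi) = \frac{1}{\gamma} \log E^{\pi}\big[ \exp(\gamma\, C) \big],
\qquad C = \sum_{k=0}^{N-1} c(x_k, u_k)

% Small-\gamma expansion: the risk-neutral cost plus a variance penalty
J_\gamma(\pi) = E^{\pi}[C] + \frac{\gamma}{2}\, \mathrm{Var}^{\pi}(C) + O(\gamma^2)
```

The expansion indicates that the risk-neutral criterion is recovered as \gamma tends to zero, while larger \gamma penalizes cost variability more heavily.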

In  [4],  an  invited  survey  paper  which  appeared  in  a  special  issue  of  the  SIAM  Journal  on 
Control  and  Optimization  dedicated  to  Prof.  Wendell  Fleming,  we  present  a  comprehensive 
survey  of  the  average  cost  control  problem  for  discrete-time  Markov  processes.  Our  exposition 
ranges  from  finite  to  Borel  state  and  action  spaces  and  includes  a  variety  of  methodologies  to
find  and  characterize  optimal  policies. 

As  a  prelude  to  studying  adaptive  control,  the  problem  of  characterizing  the  effects  that 
uncertainties  and/or  small  changes  in  the  parameters  of  a  model  can  have  on  optimal  policies 
is  considered  in  [5].  It  is  shown  that  changes  in  the  optimal  policy  are  very  difficult  to  detect, 
even  for  relatively  simple  models.  By  showing  for  a  machine  replacement  problem  modeled  by 
a  partially  observed,  finite  state  Markov  decision  process,  that  the  infinite  horizon,  optimal 
discounted  cost  function  is  piecewise  linear,  we  have  derived  formulas  for  the  optimal  cost  and 
the  optimal  policy,  thus  providing  a  means  for  carrying  out  sensitivity  analyses. 

The  stochastic  adaptive  control  of  finite  state  Markov  chains  with  incomplete  state
observations  and  unknown  parameters  is  investigated  in  [6],  [7],  [22];  in  particular,  we  have  studied
certain  classes  of  quality  control,  replacement,  and  repair  problems.  The  general  problem  is 
solved,  as  well  as  a  particular  application  in  manufacturing;  in  addition,  there  are  implications 
and  possible  applications  in  the  study  of  neural  networks,  intelligent  control,  and  the  general 
issue  of  learning  in  stochastic  adaptive  control.  In  [6],  we  design  a  certainty  equivalent
adaptive  controller  and  prove  its  optimality  via  an  averaging  method.  In  [7],  we  have  investigated
the  same  problem,  but  by  employing  a  different  adaptive  control  law,  known  as  Nonstationary
Value  Iteration  (NVI).  NVI,  instead  of  computing  the  optimal  policy  for  each  value  of  the
parameter  and  storing  it  (as  with  certainty  equivalent  policies),  computes  the  control  law  on-line
by  performing  one  step  of  a  dynamic  programming  algorithm  at  each  time,  using  the  most 
recent  parameter  estimate.  Again,  we  show  in  [7]  the  optimality  of  this  policy  for  this  class 
of  problems.  A  more  general  methodology  for  adaptive  control  of  finite  state  Markov  chains 
with  incomplete  state  observations  is  presented  in  [15],  [17]. 
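
The NVI idea described above can be sketched as follows. This is a simplified illustration and not the algorithm of [7]: it assumes a fully observed finite chain and a discounted cost with factor `beta`, whereas the cited work treats incomplete state observations. The names `nvi_step`, `transition_for`, and `theta_hat` are hypothetical.

```python
def nvi_step(v, cost, transition_for, theta_hat, beta=0.95):
    """One dynamic programming step using the most recent parameter estimate.

    v              : current value estimates, one entry per state
    cost           : cost[s][a], one-stage cost of action a in state s
    transition_for : function theta -> P, with P[a][s][j] the probability
                     of moving from state s to state j under action a
    theta_hat      : most recent estimate of the unknown parameter
    """
    P = transition_for(theta_hat)      # model implied by the current estimate
    n_states, n_actions = len(v), len(P)
    new_v, policy = [], []
    for s in range(n_states):
        # One-step lookahead cost of each action under the estimated model
        q = [cost[s][a] + beta * sum(P[a][s][j] * v[j] for j in range(n_states))
             for a in range(n_actions)]
        best = min(range(n_actions), key=q.__getitem__)
        new_v.append(q[best])
        policy.append(best)
    return new_v, policy
```

In an adaptive loop, `theta_hat` would be refreshed from the observations before each call, so only one backup is performed per time step and no table of policies indexed by the parameter has to be stored.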


1.2  Discrete  Event  Dynamical  Systems 

Motivated  by  applications  in  hierarchical  intelligent  control  in  which  we  envision  an
architecture  with  discrete  event  controllers  at  a  high  level  interacting  with  continuous  systems
and  controllers  at  lower  levels,  we  have  undertaken  a  significant  program  of  research  in
Discrete  Event  Dynamical  Systems  (DEDS).  We  have  studied  the  supervisor  synthesis  problem  of
DEDS  through  the  use  of  synchronous  composition  of  the  plant  and  supervisor,  thus
simplifying  the  DEDS  control  methodology.  Stability  and  stabilization  of  DEDS  are  studied  in  [8],  [25];
these  notions  are  presented  in  a  more  general  setting  than  in  previous  work.  Efficient  tests  for
stability  and  stabilizability  are  derived.  We  address  in  [9]  the  supervisory  synthesis  problem 
for  controlling  the  sequential  (infinite  string)  behaviors  of  DEDS  under  complete  as  well  as 
partial  information  through  the  use  of  synchronous  composition.  Closed  form  expressions  for 
supremal  languages  are  obtained,  and  supervisors  are  designed.  In  [10],  [16]  we  treat  the
state  space  of  the  DEDS,  as  opposed  to  the  set  of  events,  as  the  fundamental  concept.  We 
approach  the  problem  of  controlling  (possibly  infinite  state)  DEDS  by  using  predicates  and 
predicate  transformers.  The  supervisory  predicate  control  problem  is  introduced  and  solved. 
The  problem  of  controlling  DEDS  under  incomplete  state  observations  is  also  considered  and 
solved.  Techniques  for  finding  extremal  solutions  of  boolean  equations  are  used  to  derive 
minimally  restrictive  supervisors. 
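
For a finite state space, the idea of a minimally restrictive state-based supervisor can be illustrated with a simple fixed-point iteration. This is a sketch under stated assumptions, not the construction of [10], [16]: predicates are represented as plain sets of states, the state is fully observed, and the function name and data layout are hypothetical.

```python
def supremal_control_invariant(good_states, transitions, uncontrollable):
    """Largest subset of good_states that uncontrollable events cannot leave.

    good_states    : set of states satisfying the required predicate
    transitions    : dict mapping (state, event) -> next state
    uncontrollable : set of events the supervisor cannot disable
    """
    q = set(good_states)
    changed = True
    while changed:
        changed = False
        for (s, e), t in transitions.items():
            # A state is unsafe if an uncontrollable event leads out of q
            if s in q and e in uncontrollable and t not in q:
                q.discard(s)
                changed = True
    return q
```

A supervisor that disables exactly those controllable events leading out of the returned set keeps the system inside it while restricting behavior as little as possible, provided the initial state belongs to the set.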

Many  systems  such  as  manufacturing  systems,  database  management  systems,
communication  networks,  etc.  can  be  modeled  as  input-output  discrete  event  systems  (I/O  DEDS).  In
[13]  we  formulate  and  study  the  problem  of  stable  realization  of  such  systems.  Given  an  input 
and  an  output  language  describing  the  sequences  of  events  that  occur  at  the  input  and  the 
output,  respectively,  of  an  I/O  DEDS,  we  study  whether  it  is  possible  to  realize  the  system  as  a
unit  consisting  of  a  given  set  of  buffers  of  finite  capacity,  called  a  dispatching  unit.  Effectively
computable  necessary  and  sufficient  conditions  for  testing  for  stable  and  causal  input-output 
maps  are  obtained. 